A Novel Fuzzy Clustering Method for Outlier Detection in Data Mining
Author
Abstract
In data mining, conventional clustering algorithms struggle with natural data, which is often vague and uncertain. Fuzzy clustering methods have the potential to manage such situations efficiently. This paper illustrates the limitations of conventional clustering methods using k-means and fuzzy c-means clustering, and demonstrates the drawbacks of both algorithms in handling outlier points. We propose a new fuzzy clustering method that handles outlier points more efficiently than the conventional fuzzy c-means algorithm. The new method excludes outlier points by giving them extremely small membership values in existing clusters, whereas the fuzzy c-means algorithm tends to give them outsized membership values. The new algorithm also incorporates the positive aspects of the k-means algorithm, calculating the new cluster centers more efficiently than the c-means method.
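The outsized-membership behaviour of fuzzy c-means mentioned in the abstract can be illustrated with a small sketch. It uses the standard fuzzy c-means membership formula with fixed, hypothetical cluster centers and the common fuzzifier m = 2; the function name, centers, and sample points are assumptions made for illustration, and the paper's proposed method is not reproduced here.

```python
import math

def fcm_memberships(point, centers, m=2.0):
    """Standard fuzzy c-means membership of one point in each cluster, for fixed centers."""
    dists = [math.dist(point, c) for c in centers]
    # A point lying exactly on a center belongs fully to that cluster.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    # u_i = 1 / sum_k (d_i / d_k)^(2 / (m - 1)); memberships always sum to 1.
    return [1.0 / sum((d_i / d_k) ** exponent for d_k in dists) for d_i in dists]

# Hypothetical data: two cluster centers, one inlier, one distant outlier.
centers = [(0.0, 0.0), (10.0, 0.0)]
inlier = (1.0, 0.0)
outlier = (5.0, 100.0)  # far from both centers, equidistant from them

inlier_u = fcm_memberships(inlier, centers)
outlier_u = fcm_memberships(outlier, centers)
print("inlier :", inlier_u)
print("outlier:", outlier_u)
```

Because the memberships of every point are constrained to sum to 1, the distant point still receives a membership of 0.5 in each cluster, even though it is far from both. This is the constraint that a method assigning extremely small memberships to outliers has to relax or replace.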
Similar resources
Outlier Detection Using Extreme Learning Machines Based on Quantum Fuzzy C-Means
One of the most important concerns of a data miner is to have accurate, error-free data: data that contains no human errors and whose records are complete and correct. In this paper, a new learning model based on an extreme learning machine neural network is proposed for outlier detection. The function of neural networks depends on various parameters such as the structur...
Implementation of Fuzzy c-Means and Outlier Detection for Intrusion Detection with KDD Cup 1999 Data Set
In this paper, a two-phase method for computer network intrusion detection is proposed. In the first phase, a set of patterns (data) are clustered by the fuzzy c-means algorithm. In the second phase, outliers are constructed by a distance-based technique and a class label is assigned to each pattern. The KDD Cup 1999 data set is used for the experiment. The results show that, for binary classif...
Support Vector Clustering for Outlier Detection
In this paper, a novel support vector clustering (SVC) method for outlier detection is proposed. Outlier detection algorithms have applications in several tasks such as data mining, data preprocessing, data filter-cleaner, time series analysis and so on. Traditionally, outlier detection methods are mostly based on modeling data according to its statistical properties, and these approaches are only prefe...
Oil Reservoirs Classification Using Fuzzy Clustering (RESEARCH NOTE)
Enhanced Oil Recovery (EOR) is a well-known method to increase oil production from oil reservoirs. Applying EOR to a new reservoir is a costly and time-consuming process. Incorporating available knowledge of oil reservoirs in the EOR process eliminates these costs and saves operational time and work. This work presents a universal method to apply EOR to reservoirs based on the available data by...
An Efficient Clustering and Distance Based Approach for Outlier Detection
Outlier detection is a substantial research problem in the domain of data mining that aims to uncover objects that are significantly different from, exceptional relative to, or inconsistent with the rest of the data. Outlier detection has been widely researched and finds use within various application domains including tax fraud detection, network robustness analysis, network intrusion and medical diagnosis. I...